Purpose of this exercise: answer the question *How many times did the cadmium concentration in sediment exceed the safe limit (ERL; Lyons et al., 2017) in 2015?*
We will download some EMODnet Biology species observations with the Download Toolbox:
Now we can start analyzing these data in R.
First, we load the R packages that we will use in this exercise:
library(dplyr) # package for data manipulation
library(data.table) # package for fast reading of csv files
library(sf) # simple features
library(mapview) # interactive maps
library(ggplot2) # plotting
Read the data:
cd_data <- read.csv("data/20190208_Cd_data_ICES.csv")
head(cd_data)
It seems that there are multiple units in this dataset!
# count how often each unit occurs in the dataset
table(cd_data$Unit)
##
## g/g mg/kg ug/g ug/kg
## 320 250 167 3820
OK, let’s convert all values to a single unit (ug/kg) using some dplyr functions.
The %>% is the ‘pipe’ operator. It chains multiple actions together, which makes the code easier to read:
x %>% y %>% z means
‘take data x, do y with it, and then do z with the result’.
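As a small illustration of the pipe, the sketch below uses a made-up data frame (`toy`, with hypothetical columns `station` and `value`, not part of the ICES dataset). It writes the same two steps first as nested calls and then as a pipe chain; both forms should give the same result.

```r
library(dplyr)

# made-up example data, not part of the ICES dataset
toy <- data.frame(station = c("A", "B", "C"),
                  value   = c(0.5, 2.0, 1.5))

# nested calls: read from the inside out
nested <- summarise(filter(toy, value > 1), mean_value = mean(value))

# the same steps with the pipe: read from left to right
piped <- toy %>%
  filter(value > 1) %>%
  summarise(mean_value = mean(value))

# compare the two results
identical(nested, piped)
```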
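# manual conversion of all values to ug/kg
cd_data <- cd_data %>%
  mutate(
    ugkg = case_when(
      Unit == 'ug/g'  ~ Value * 1000, # 1 ug/g  = 1000 ug/kg
      Unit == 'mg/kg' ~ Value * 1000, # 1 mg/kg = 1000 ug/kg
      Unit == 'g/g'   ~ Value * 1E9,  # 1 g/g   = 1e9 ug/kg
      Unit == 'ug/kg' ~ Value         # already in ug/kg
    ),
    # 1 if the concentration reaches the 1200 ug/kg limit, 0 otherwise
    Exceeded = as.numeric(ugkg >= 1200)
  )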
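One way to double-check the conversion is to apply the same factors to a few made-up values for which the answer is known. The sketch below is an alternative formulation, not the approach used in this exercise: it stores the factors in a named vector (`to_ugkg`, a name chosen for this example) and looks them up by unit. The `check` data frame is invented for illustration, and `ng/g` is included only to show what happens with a unit that has no factor.

```r
# conversion factors to ug/kg, same values as in the case_when() call
to_ugkg <- c("ug/g" = 1000, "mg/kg" = 1000, "g/g" = 1E9, "ug/kg" = 1)

# made-up values: the first four all represent 1500 ug/kg,
# the last one uses a unit that is not in the lookup
check <- data.frame(
  Unit  = c("ug/g", "mg/kg", "g/g", "ug/kg", "ng/g"),
  Value = c(1.5, 1.5, 1.5e-6, 1500, 1500),
  stringsAsFactors = FALSE
)

# look up the factor for each row; unknown units give NA
check$ugkg <- check$Value * unname(to_ugkg[check$Unit])
check$Exceeded <- as.numeric(check$ugkg >= 1200)
check
```

A unit missing from the lookup yields `NA` rather than a silently wrong number, which makes unexpected units easy to spot. If `Unit` is stored as a factor, convert it with `as.character()` before the lookup, because indexing by a factor uses its integer codes.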
Make the data spatially aware (convert to ‘sf’ object):
cd_data_sf <- st_as_sf(cd_data,
                       coords = c("Longitude..degrees_east.", "Latitude..degrees_north."),
                       crs = 4326, # WGS84
                       remove = FALSE)
Plot the data:
mapview(cd_data_sf, zcol = "Exceeded",
        viewer.suppress = FALSE)
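To get from the `Exceeded` flag to an answer to the question, the flagged rows still have to be counted for 2015. The sketch below shows one way to do that on made-up data. The column name `Year` is an assumption; the actual ICES export may store the sampling date under a different name or as a full date, in which case the year has to be extracted first.

```r
library(dplyr)

# made-up example data; `Year` is a hypothetical column name
toy <- data.frame(
  Year = c(2014, 2015, 2015, 2015, 2016),
  ugkg = c(1300, 900, 1200, 2500, 1500)
)

n_exceeded_2015 <- toy %>%
  mutate(Exceeded = as.numeric(ugkg >= 1200)) %>% # same threshold as above
  filter(Year == 2015) %>%                        # keep only 2015 samples
  summarise(n = sum(Exceeded, na.rm = TRUE)) %>%  # count flagged rows
  pull(n)

n_exceeded_2015
```

Using `na.rm = TRUE` means rows whose unit could not be converted are ignored in the count rather than turning the whole sum into `NA`.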